Voice activity detection in degraded speech using excitation source information

Authors

  • K. Sri Rama Murty
  • Bayya Yegnanarayana
  • S. Guruprasad
Abstract

This paper proposes a method for detecting voiced regions in speech signals collected in noisy environments. The method is based on the characteristics of the excitation source in speech production. The degraded speech signal is processed by linear prediction analysis to derive the linear prediction residual. The Hilbert envelope of the linear prediction residual is processed using covariance analysis to obtain a coherently added covariance signal. The periodicity of the coherently added covariance signal is exploited to detect the voiced regions using autocorrelation analysis. The performance of the proposed voice activity detection algorithm is evaluated under different noise environments and at different levels of degradation.
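
The processing chain named in the abstract can be illustrated with a rough, pure-Python sketch. It is not the authors' implementation: the covariance analysis and coherent addition step is omitted because the abstract gives no detail on it, so the autocorrelation is applied directly to the Hilbert envelope of the residual. The function names, the LP order of 10, the 8 kHz sampling rate, the 50 to 400 Hz pitch range and the synthetic test frames are all assumptions made for illustration.

```python
import cmath
import math
import random


def autocorr(x, max_lag):
    """Biased autocorrelation sums r[0..max_lag] of a real sequence."""
    n = len(x)
    return [sum(x[i] * x[i + k] for i in range(n - k)) for k in range(max_lag + 1)]


def lp_residual(frame, order=10):
    """LP residual via the autocorrelation method (Levinson-Durbin recursion)."""
    r = autocorr(frame, order)
    a = [1.0] + [0.0] * order            # inverse filter A(z), a[0] = 1
    err = r[0] * (1.0 + 1e-9) + 1e-12    # tiny loading avoids division by zero
    for m in range(1, order + 1):
        if err <= 1e-12:                 # signal already fully predicted
            break
        k = -sum(a[j] * r[m - j] for j in range(m)) / err
        prev = a[:]
        for j in range(1, m + 1):
            a[j] = prev[j] + k * prev[m - j]
        err *= 1.0 - k * k
    # Residual e[n] = sum_j a[j] * s[n - j], with zero history before the frame.
    return [sum(a[j] * frame[n - j] for j in range(min(order, n) + 1))
            for n in range(len(frame))]


def hilbert_envelope(x):
    """Magnitude of the analytic signal, computed with a direct O(N^2) DFT."""
    n = len(x)
    w = [cmath.exp(-2j * math.pi * k / n) for k in range(n)]
    spec = [sum(x[t] * w[(k * t) % n] for t in range(n)) for k in range(n)]
    for k in range(n):
        if 0 < k < n / 2:
            spec[k] *= 2                 # double positive frequencies
        elif k > n / 2:
            spec[k] = 0                  # discard negative frequencies
    analytic = [sum(spec[k] * w[(k * t) % n].conjugate() for k in range(n)) / n
                for t in range(n)]
    return [abs(z) for z in analytic]


def periodicity_strength(env, min_lag, max_lag):
    """Largest normalized autocorrelation value in the lag range, and its lag."""
    mean = sum(env) / len(env)
    r = autocorr([v - mean for v in env], max_lag)
    if r[0] <= 0:
        return 0.0, 0
    best = max(range(min_lag, max_lag + 1), key=lambda k: r[k])
    return r[best] / r[0], best


def voicing_evidence(frame, fs=8000, order=10):
    """Residual -> Hilbert envelope -> autocorrelation peak (assumed 50-400 Hz pitch)."""
    env = hilbert_envelope(lp_residual(frame, order))
    return periodicity_strength(env, fs // 400, fs // 50)


def synth_frames(n=400, period=80, seed=0):
    """Hypothetical test data: a pulse-excited resonator and a white-noise frame."""
    rng = random.Random(seed)
    voiced = [0.0] * n
    for t in range(n):
        pulse = 1.0 if t % period == 0 else 0.0
        y1 = voiced[t - 1] if t >= 1 else 0.0
        y2 = voiced[t - 2] if t >= 2 else 0.0
        voiced[t] = pulse + 1.3 * y1 - 0.8 * y2
    voiced = [v + 0.01 * rng.gauss(0, 1) for v in voiced]
    noise = [rng.gauss(0, 1) for _ in range(n)]
    return voiced, noise


if __name__ == "__main__":
    for name, frame in zip(("voiced", "noise"), synth_frames()):
        strength, lag = voicing_evidence(frame)
        print(name, round(strength, 2), lag)
```

A frame would be labelled voiced when the returned strength exceeds a threshold. That threshold is a tuning choice not given in the abstract, and it would need to be set on real noisy data rather than on synthetic frames like these.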

Similar resources

A New Algorithm for Voice Activity Detection Based on Wavelet Packets (RESEARCH NOTE)

Speech constitutes much of the communicated information; most other perceived audio signals do not carry nearly as much information. Indeed, many non-speech signals may be classified as ‘noise’ in human communication. The process of separating conversational speech and noise is termed voice activity detection (VAD). This paper describes a new approach to VAD which is based on the Wavelet ...

Extraction of Excitation Information from Speech and Its Applications for Expressive Speech Processing

Through the speech production mechanism, speech with different voice qualities, such as phonations, emotions, expressive singing and other paralinguistic sounds, is also produced. Most of these sounds owe their features mainly to the excitation component (vibration of the vocal folds at the glottis), whereas the dynamic vocal tract system primarily conveys the message. Hence, the excitati...

Audiovisual speech source separation: a regularization method based on visual voice activity detection

Audio-visual speech source separation consists in mixing visual speech processing techniques (e.g. lip parameters tracking) with source separation methods to improve and/or simplify the extraction of a speech signal from a mixture of acoustic signals. In this paper, we present a new approach to this problem: visual information is used here as a voice activity detector (VAD). Results show that, ...

Enhancement of speech in multispeaker environment

In this paper, a method based on excitation source information is proposed for enhancing speech degraded by speech from other speakers. Speech from multiple speakers is simultaneously collected over two spatially distributed microphones. The time delay of each speaker with respect to the two microphones is estimated using the excitation source information. A weight function is derived for ...

Evaluation of Glottal Epoch Detection Algorithms on Different Voice Types

According to the source-filter model of speech production, speech can be represented by passing the excitation signal through the vocal tract filter. The epoch or instant of maximum excitation corresponds to the glottal closure instant. Several speech processing applications require robust epoch detection but this can be a difficult task. Although state-of-the-art epoch estimation methods can p...

Publication date: 2007